An Efficient FPGA-based Accelerator for Deep Forest
Deep Forest is a prominent machine learning algorithm known for its high
accuracy in forecasting. Compared with deep neural networks, Deep Forest has
almost no multiplication operations and has better performance on small
datasets. However, due to the deep structure and large forest quantity, it
suffers from large amounts of calculation and memory consumption. In this
paper, an efficient hardware accelerator is proposed for deep forest models,
which is also the first work to implement Deep Forest on FPGA. Firstly, a
delicate node computing unit (NCU) is designed to improve inference speed.
Secondly, based on NCU, an efficient architecture and an adaptive dataflow are
proposed, in order to alleviate the problem of node computing imbalance in the
classification process. Moreover, an optimized storage scheme in this design
also improves hardware utilization and power efficiency. The proposed design is
implemented on an FPGA board, Intel Stratix V, and it is evaluated by two
typical datasets, ADULT and Face Mask Detection. The experimental results show
that the proposed design can achieve around 40x speedup compared to that on a
40-core high-performance x86 CPU.
Comment: 5 pages, 5 figures, conferenc
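As a rough illustration of why forest inference needs almost no multiplications, the sketch below walks a decision tree using only threshold comparisons. It is a minimal software sketch, not the paper's NCU or dataflow: the `Node` layout, `predict_tree`, and `predict_forest` are hypothetical names, and majority voting stands in for whatever aggregation the actual model uses. The returned step count varies with the input, which hints at the node computing imbalance mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical tree node: internal nodes compare, leaves hold a label."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None  # set only on leaves


def predict_tree(root: Node, x: Sequence[float]) -> Tuple[int, int]:
    """Return (label, number of node visits); uses comparisons only."""
    node, steps = root, 0
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
        steps += 1
    return node.label, steps


def predict_forest(trees: Sequence[Node], x: Sequence[float]) -> int:
    """Majority vote over trees (an assumed, simplified aggregation)."""
    votes = {}
    for tree in trees:
        label, _ = predict_tree(tree, x)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Toy tree: a depth-1 path on the left, a depth-2 path on the right.
tree_a = Node(0, 0.5, Node(label=0), Node(1, 2.0, Node(label=1), Node(label=0)))
```

Different inputs traverse different numbers of nodes in `tree_a`, so a fixed pipeline of identical units would see uneven load per sample.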
In-Domain GAN Inversion for Faithful Reconstruction and Editability
Generative Adversarial Networks (GANs) have significantly advanced image
synthesis through mapping randomly sampled latent codes to high-fidelity
synthesized images. However, applying well-trained GANs to real image editing
remains challenging. A common solution is to find an approximate latent code
that can adequately recover the input image to edit, which is also known as GAN
inversion. To invert a GAN model, prior works typically focus on reconstructing
the target image at the pixel level, yet few studies are conducted on whether
the inverted result can well support manipulation at the semantic level. This
work fills in this gap by proposing in-domain GAN inversion, which consists of
a domain-guided encoder and a domain-regularized optimizer, to regularize the
inverted code in the native latent space of the pre-trained GAN model. In this
way, we manage to sufficiently reuse the knowledge learned by GANs for image
reconstruction, facilitating a wide range of editing applications without any
retraining. We further make comprehensive analyses on the effects of the
encoder structure, the starting inversion point, as well as the inversion
parameter space, and observe the trade-off between the reconstruction quality
and the editing property. Such a trade-off sheds light on how a GAN model
represents an image with various semantics encoded in the learned latent
distribution. Code, models, and demo are available at the project page:
https://genforce.github.io/idinvert/
LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis
This work presents an easy-to-use regularizer for GAN training, which helps
explicitly link some axes of the latent space to a set of pixels in the
synthesized image. Establishing such a connection facilitates a more convenient
local control of GAN generation, where users can alter the image content only
within a spatial area simply by partially resampling the latent code.
Experimental results confirm four appealing properties of our regularizer,
which we call LinkGAN. (1) The latent-pixel linkage is applicable to either a
fixed region (i.e., the same for all instances) or a particular semantic
category (i.e., varying across instances), like the sky. (2) Two or more
regions can be independently linked to different latent axes, which further
supports joint control. (3) Our regularizer can improve the spatial
controllability of both 2D and 3D-aware GAN models, barely sacrificing the
synthesis performance. (4) The models trained with our regularizer are
compatible with GAN inversion techniques and maintain editability on real
images.
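The "partial resampling" described above can be sketched in a few lines. This is only an illustration of the editing step, assuming a latent code stored as a flat list and a hypothetical set of axis indices that training has linked to a region; it does not include the regularizer or a generator, and `resample_linked_axes` is not a name from the paper.

```python
import random
from typing import List, Sequence


def resample_linked_axes(
    z: Sequence[float], linked_axes: Sequence[int], rng: random.Random
) -> List[float]:
    """Redraw only the linked axes from N(0, 1); keep all other axes fixed."""
    z_new = list(z)
    for i in linked_axes:
        z_new[i] = rng.gauss(0.0, 1.0)
    return z_new


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(8)]   # toy 8-dimensional latent code
linked = [0, 1, 2]                            # assumed axes tied to one region
z_edit = resample_linked_axes(z, linked, rng)
```

Feeding `z` and `z_edit` to a generator trained with such a linkage would, under the abstract's claim, change content only inside the linked region.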
Node.js scalability investigation in the cloud
Node.js has gained popularity in cloud development due to its asynchronous, non-blocking and event-driven nature. However, scalability issues can limit the number of concurrent requests that can be served at an acceptable level of performance. To the best of our knowledge, no cloud-based benchmarks or metrics focusing on Node.js scalability exist. This paper presents the design and implementation of Ibenchjs, a scalability-oriented benchmarking framework, and a set of sample test applications. We deploy Ibenchjs in a local and isolated cloud to collect and report scalability-related measurements and issues of Node.js as well as performance bottlenecks. Our findings include: 1) the scaling performance of the tested Node.js applications was sub-linear; 2) no improvements were measured when more CPUs were added without modifying the number of Node.js instances; and 3) leveraging cloud scaling solutions significantly outperformed Node.js-module-based scaling.
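One common way to make "sub-linear scaling" concrete is to compare measured throughput against ideal linear scaling from a single instance. The sketch below assumes that definition; it is not Ibenchjs's metric, and the throughput numbers are made-up placeholders rather than results from the paper.

```python
from typing import Dict


def scaling_efficiency(throughputs: Dict[int, float]) -> Dict[int, float]:
    """Efficiency = measured throughput / (instances * single-instance throughput)."""
    base = throughputs[1]
    return {n: t / (n * base) for n, t in sorted(throughputs.items())}


def is_sublinear(throughputs: Dict[int, float]) -> bool:
    """True if every multi-instance run falls short of ideal linear scaling."""
    eff = scaling_efficiency(throughputs)
    return all(e < 1.0 for n, e in eff.items() if n > 1)


# Hypothetical requests/second by instance count (illustrative only).
measured = {1: 1000.0, 2: 1800.0, 4: 3000.0}
efficiency = scaling_efficiency(measured)
```

An efficiency that falls as instances are added is the signature of sub-linear scaling.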
Experimental Study on the Thermal Response of PCM Energy Storage Block with Hole Ventilation
Under the climatic conditions of Nanjing, the effect of night-ventilation velocity on the thermal response of a south wall built from phase-change material (PCM) blocks with different configurations has been investigated and analyzed. The results show that the thermal performance is better when the PCM is placed near the inner side of the hollow block than when it is placed near the outer side. Meanwhile, the maximum amplitude of the temperature on the interior surface when the PCM is placed at the inner side is 58.3% higher than that of the outer side. The optimal flow velocity of both A and B is 2 m/s. Meanwhile, the minimum amplitudes of the temperature on the interior surface are 1.74°C and 3.72°C, and the retardation coefficients are 8 h and 7 h. Compared with the structure configuration without ventilation, the heat flow was reduced by 38.2% and 29.3%, respectively, and the equivalent heat resistance increased by 115.8% and 88.6%.